Understanding confounding effects in linguistic coordination: an information-theoretic approach
We suggest an information-theoretic approach for measuring stylistic
coordination in dialogues. The proposed measure has a simple predictive
interpretation and can account for various confounding factors through proper
conditioning. We revisit some of the previous studies that reported strong
signatures of stylistic accommodation, and find that a significant part of the
observed coordination can be attributed to a simple confounding effect: length
coordination. Specifically, longer utterances tend to be followed by longer
responses, which gives rise to spurious correlations in the other stylistic
features. We propose a test to distinguish correlations in length due to
contextual factors (topic of conversation, user verbosity, etc.) from those
due to turn-by-turn coordination. We also suggest a test to identify whether
stylistic coordination persists even after accounting for length coordination
and contextual factors.
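The idea of accounting for a confounder "through proper conditioning" can be illustrated with a plug-in estimate of conditional mutual information. This is a minimal sketch under stated assumptions, not the measure defined in the paper: the function name, the encoding of each turn pair as an (x, y, z) triple, and the use of a discretized length bin as the conditioning variable z are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def conditional_mutual_information(triples):
    """Plug-in estimate of I(X; Y | Z) in bits from (x, y, z) samples.

    Hypothetical encoding for a dialogue corpus:
      x = 1 if the first utterance contains a given style marker, else 0
      y = 1 if the response contains that marker, else 0
      z = a discrete length bin for the pair (the suspected confounder)
    """
    n = len(triples)
    if n == 0:
        raise ValueError("need at least one sample")
    c_xyz = Counter(triples)
    c_xz = Counter((x, z) for x, _, z in triples)
    c_yz = Counter((y, z) for _, y, z in triples)
    c_z = Counter(z for _, _, z in triples)
    cmi = 0.0
    for (x, y, z), n_xyz in c_xyz.items():
        # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts
        ratio = n_xyz * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)])
        cmi += (n_xyz / n) * math.log2(ratio)
    return cmi
```

If x and y are both driven entirely by z, the estimate conditioned on z drops to zero even though the unconditioned dependence between x and y is large, which is the pattern a length confound would produce.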
Efficient Estimation of Mutual Information for Strongly Dependent Variables
We demonstrate that a popular class of nonparametric mutual information (MI)
estimators based on k-nearest-neighbor graphs requires a number of samples that
scales exponentially with the true MI. Consequently, accurate estimation of MI
between two strongly dependent variables is possible only with a prohibitively
large sample size. This important yet overlooked shortcoming of the existing
estimators is due to their implicit reliance on local uniformity of the
underlying joint distribution. We introduce a new estimator that is robust to
local non-uniformity, works well with limited data, and is able to capture
relationship strengths over many orders of magnitude. We demonstrate the
superior performance of the proposed estimator on both synthetic and real-world
data.
Comment: 13 pages, to appear in International Conference on Artificial
Intelligence and Statistics (AISTATS) 201
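For orientation, here is a minimal sketch of one well-known member of the k-nearest-neighbor family of MI estimators, the Kraskov-Stögbauer-Grassberger (KSG) estimator, variant 1, for scalar variables. The abstract above does not name a specific estimator, so choosing KSG is an assumption, and this is the baseline class being criticized, not the new estimator the paper introduces. The function names and the synthetic-data helper are hypothetical, and the pure-Python O(n^2) neighbor search is only suitable for small samples.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg_mutual_information(xs, ys, k=3):
    """KSG (variant 1) estimate of I(X; Y) in nats for scalar samples."""
    n = len(xs)
    if len(ys) != n or not 1 <= k < n:
        raise ValueError("need equal-length samples and 1 <= k < n")
    total = 0.0
    for i in range(n):
        dx = [abs(xs[i] - xs[j]) for j in range(n) if j != i]
        dy = [abs(ys[i] - ys[j]) for j in range(n) if j != i]
        # Distance to the k-th nearest neighbor in the joint space (max-norm).
        eps = sorted(max(a, b) for a, b in zip(dx, dy))[k - 1]
        # Count marginal neighbors strictly closer than eps.
        n_x = sum(1 for a in dx if a < eps)
        n_y = sum(1 for b in dy if b < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def correlated_gaussian_sample(n, rho, seed=0):
    """Hypothetical test data: n pairs of unit Gaussians with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho * rho) * noise)
    return xs, ys
```

For a bivariate Gaussian the true MI is -0.5 ln(1 - rho^2) nats, about 0.83 nats at rho = 0.9, so comparing the estimate against that value for increasing rho and a fixed sample size is one way to probe the behavior the abstract describes.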
Kernelized Hashcode Representations for Relation Extraction
Kernel methods have produced state-of-the-art results for a number of NLP
tasks such as relation extraction, but suffer from poor scalability due to the
high cost of computing kernel similarities between natural language structures.
A recently proposed technique, kernelized locality-sensitive hashing (KLSH),
can significantly reduce the computational cost, but is only applicable to
classifiers operating on kNN graphs. Here we propose to use random subspaces of
KLSH codes for efficiently constructing an explicit representation of NLP
structures suitable for general classification methods. Further, we propose an
approach for optimizing the KLSH model for classification problems by
maximizing an approximation of mutual information between the KLSH codes
(feature vectors) and the class labels. We evaluate the proposed approach on
biomedical relation extraction datasets, and observe significant and robust
improvements in accuracy w.r.t. state-of-the-art classifiers, along with
drastic (orders-of-magnitude) speedup compared to conventional kernel methods.
Comment: To appear in the proceedings of conference, AAAI-1
Dialog State Tracking: A Neural Reading Comprehension Approach
Dialog state tracking is used to estimate the current belief state of a
dialog given all the preceding conversation. Machine reading comprehension, on
the other hand, focuses on building systems that read passages of text and
answer questions that require some understanding of passages. We formulate
dialog state tracking as a reading comprehension task to answer the question
after reading conversational
context. In contrast to traditional state tracking methods where the dialog
state is often predicted as a distribution over a closed set of all the
possible slot values within an ontology, our method uses a simple
attention-based neural network to point to the slot values within the
conversation. Experiments on the MultiWOZ-2.0 cross-domain dialog dataset show
that our simple system can obtain accuracies similar to those of previous, more
complex methods. By exploiting recent advances in contextual word embeddings,
adding a model that explicitly tracks whether a slot value should be carried
over to the next turn, and combining our method with a traditional joint state
tracking method that relies on a closed-set vocabulary, we can obtain a
joint-goal accuracy of on the standard test split, exceeding current
state-of-the-art by **.
Comment: 10 pages, to appear in Special Interest Group on Discourse and
Dialogue (SIGDIAL) 2019 (ORAL
CasIL: Cognizing and Imitating Skills via a Dual Cognition-Action Architecture
Enabling robots to effectively imitate expert skills in long-horizon tasks
such as locomotion, manipulation, and more, poses a long-standing challenge.
Existing imitation learning (IL) approaches for robots still grapple with
sub-optimal performance in complex tasks. In this paper, we consider how this
challenge can be addressed within the human cognitive priors. Heuristically, we
extend the usual notion of action to a dual Cognition (high-level)-Action
(low-level) architecture by introducing intuitive human cognitive priors, and
propose a novel skill IL framework through human-robot interaction, called
Cognition-Action-based Skill Imitation Learning (CasIL), for the robotic agent
to effectively cognize and imitate the critical skills from raw visual
demonstrations. CasIL enables both cognition and action imitation, while
high-level skill cognition explicitly guides low-level primitive actions,
providing robustness and reliability to the entire skill IL process. We
evaluated our method on MuJoCo and RLBench benchmarks, as well as on the
obstacle avoidance and point-goal navigation tasks for quadrupedal robot
locomotion. Experimental results show that CasIL consistently achieves
competitive and robust skill imitation capability compared to its
counterparts in a variety of long-horizon robotic tasks.